12 research outputs found

    Perceptually Motivated Shape Context Which Uses Shape Interiors

    In this paper, we identify some of the limitations of current shape matching techniques. We provide examples of how contour-based shape matching techniques cannot provide a good match for certain visually similar shapes. To overcome this limitation, we propose a perceptually motivated variant of the well-known shape context descriptor. We identify that the interior properties of the shape play an important role in object recognition and develop a descriptor that captures these interior properties. We show that our method can easily be augmented with any other shape matching algorithm. We also show from our experiments that the use of our descriptor can significantly improve the retrieval rates.

    Visual Concepts and Compositional Voting

    It is very attractive to formulate vision in terms of pattern theory \cite{Mumford2010pattern}, where patterns are defined hierarchically by compositions of elementary building blocks. But applying pattern theory to real-world images is currently less successful than discriminative methods such as deep networks. Deep networks, however, are black boxes that are hard to interpret and can easily be fooled by adding occluding objects. It is natural to wonder whether, by better understanding deep networks, we can extract building blocks that can be used to develop pattern-theoretic models. This motivates us to study the internal representations of a deep network using vehicle images from the PASCAL3D+ dataset. We use clustering algorithms to study the population activities of the features and extract a set of visual concepts, which we show are visually tight and correspond to semantic parts of vehicles. To analyze this, we annotate these vehicles by their semantic parts to create a new dataset, VehicleSemanticParts, and evaluate visual concepts as unsupervised part detectors. We show that visual concepts perform fairly well but are outperformed by supervised discriminative methods such as Support Vector Machines (SVM). We next give a more detailed analysis of visual concepts and how they relate to semantic parts. Following this, we use the visual concepts as building blocks for a simple pattern-theoretic model, which we call compositional voting. In this model, several visual concepts combine to detect semantic parts. We show that this approach is significantly better than discriminative methods like SVM and deep networks trained specifically for semantic part detection. Finally, we return to studying occlusion by creating an annotated dataset with occlusion, called VehicleOcclusion, and show that compositional voting outperforms even deep networks when the amount of occlusion becomes large. Comment: Accepted by Annals of Mathematical Sciences and Application
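    The clustering step mentioned in this abstract can be illustrated with a minimal sketch. The abstract only says "clustering algorithms", so the choice of plain k-means here is an assumption, as are the function name `kmeans`, its parameters, and the toy data standing in for deep-network feature vectors.

```python
import numpy as np

def kmeans(features, k, n_iter=20, seed=0):
    """Plain k-means: returns (centers, labels) for an (n, d) feature array."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every feature to every center: (n, k).
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

# Toy stand-in for feature vectors: two well-separated blobs (hypothetical data).
rng = np.random.default_rng(1)
toy = np.vstack([rng.normal(0.0, 0.1, size=(50, 4)),
                 rng.normal(10.0, 0.1, size=(50, 4))])
centers, labels = kmeans(toy, k=2)
```

    In this reading, each cluster center would play the role of one candidate visual concept; how the features are extracted and normalised is not stated in the abstract and is left out of the sketch.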

    Exploiting shape properties for improved retrieval, discrimination and recognition

    Recognition of categories of objects is one of the central problems of computer vision. The human visual system has an unmatched ability to recognize objects across multiple modalities and appearances. Recognition of objects via their shapes is one of the primary reasons why humans are able to perform well in vision-related tasks. However, automatic algorithms that try to recognize objects are far from such ability. Therefore, this thesis concentrates on developing better representations of shapes to aid in object retrieval and object detection. We begin by questioning the implicit assumption that all of the shape information lies in the contours, and show that making use of the interior properties of the shapes produces better results when matching shapes. To this end, we introduce a novel descriptor, namely, the Solid Shape Context. We then hypothesize that not all parts of the shape are equally important when extracting the shape properties, and try to identify the most discriminative parts. We extract the most discriminative parts of a shape by comparing each shape to its closest rivals and propose a means to improve the discriminative capability of standard shape descriptors. We then propose a simple and intuitive way to obtain robust neighborhoods, which help in computing the "true" distances between shapes. This is achieved by mining additional information that was, until now, unidentified. In addition, we provide soft probabilistic measures for the inclusion or removal of a node from a local neighborhood. The ability to measure the confidence of nodes being part of the neighborhood has not been explored until now. This work opens many avenues for future research in the rapidly growing field of retrieval. Finally, as a unifying work, we propose a framework for performing shape-based object detection in real-world images, which also allows for the identification of object parts. This work bridges the gap between two independent, but actively researched, threads in the field of object detection. We propose a structured prediction approach for predicting object part labels, where the label of each part is influenced by its neighboring parts. DOCTOR OF PHILOSOPHY (SCE

    Improving shape context using geodesic information and reflection invariance

    In this paper, we identify some of the existing problems in shape context matching. We first identify the need for reflection invariance in shape context matching algorithms and propose a method to achieve it. Using these reflection invariance techniques, we bring all the objects in a database to their canonical form, which halves the time required to match two shapes using their contexts. We then show how we can build better shape descriptors by using geodesic information from the shapes, and hence improve upon the well-known Inner Distance Shape Context (IDSC). The IDSC is used by many pre- and post-processing algorithms as the baseline shape-matching algorithm. Our improvements to IDSC remain compatible with those algorithms. Finally, we introduce new comparison metrics that can be used to compare two or more algorithms. We have tested our proposals on the MPEG-7 database and show that our methods significantly outperform the IDSC.

    Dense sampling of shape interiors for improved representation

    Matching shapes accurately is an important requirement in various applications, the most notable of which is object recognition. Precisely matching shapes is a difficult task and is an active area of research in the computer vision community. Most shape matching techniques rely on the contour of the object to provide the object's shape properties. However, we show that using the contour alone cannot help in matching all kinds of shapes. Many objects are recognised because of their overall visual similarity, rather than just their contour properties. In this paper, we assert that modelling the interior properties of the shape can help in extracting this overall visual similarity. We propose a simple way to extract the shape's interior properties. This is done by densely sampling points from within the shape and using them to describe the shape's features. We show that such an approach provides an effective way to match shapes that are visually similar to each other but have vastly different contour properties.
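    The idea of densely sampling interior points and describing the shape from them can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the grid sampling, the log-polar binning (borrowed from the standard shape context descriptor), and the names `sample_interior` and `log_polar_histogram` with their parameters are all hypothetical.

```python
import numpy as np

def sample_interior(mask, step=2):
    """Return (row, col) coordinates of foreground pixels on a regular grid."""
    rows, cols = np.nonzero(mask[::step, ::step])
    return np.stack([rows * step, cols * step], axis=1).astype(float)

def log_polar_histogram(points, ref, n_r=5, n_theta=12):
    """Shape-context-style log-polar histogram of `points` relative to `ref`."""
    diff = points - ref
    dist = np.hypot(diff[:, 0], diff[:, 1])
    keep = dist > 0                      # drop the reference point itself
    diff, dist = diff[keep], dist[keep]
    dist = dist / dist.mean()            # normalise for scale
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    theta = np.arctan2(diff[:, 0], diff[:, 1]) % (2 * np.pi)
    r_bin = np.digitize(dist, r_edges) - 1
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)
    hist = np.zeros((n_r, n_theta))
    valid = (r_bin >= 0) & (r_bin < n_r)  # ignore points outside the radial range
    np.add.at(hist, (r_bin[valid], t_bin[valid]), 1)
    return hist

# Toy shape: a filled disc (hypothetical input; any binary mask would do).
yy, xx = np.mgrid[:41, :41]
mask = (yy - 20) ** 2 + (xx - 20) ** 2 <= 15 ** 2
pts = sample_interior(mask, step=2)
hist = log_polar_histogram(pts, ref=pts[0])
```

    Because the histogram is built from points inside the shape rather than on its boundary, two shapes with different contours but similar filled regions would yield similar histograms under this sketch.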

    PREMACHANDRAN V.: Three-Dimensional Bilateral Symmetry Plane Estimation in the Phase Domain

    Abstract We show that bilateral symmetry plane estimation for three-dimensional (3-D

    Can relative skill be determined from a photographic portfolio?

    In this study, our primary aim is to determine empirically the role that skill plays in determining image aesthetics, and whether it can be deciphered from the ratings given by a diverse group of judges. To this end, we have collected and analyzed data from a large number of subjects (168 in total) on a set of 221 images taken by 33 photographers with different levels of photographic skill and experience. We also experimented with the rating scales used by previous studies in this domain by introducing a binary rating system for collecting judges’ opinions. The study also demonstrates the use of Amazon Mechanical Turk as a crowd-sourcing platform for collecting scientific data and evaluating the skill of the judges participating in the experiment. We use a variety of performance and correlation metrics to evaluate the consistency of ratings across different rating scales and compare our findings. A novel feature of our study is an attempt to define a threshold based on the consistency of ratings when judges rate duplicate images. Our conclusion deviates from earlier findings and our own expectations: the ratings could not determine the skill levels of photographers to a statistically significant level.